WHAT IS CLAIMED:

1. A method for characterizing regulatory sequences associated with a genetic
locus, said method comprising the steps of: 

(a) providing a sample containing nuclear chromatin; 

(b) treating said sample with an agent that induces modifications in DNA at 
hypersensitivity sites; and 

(c) identifying the DNA hypersensitivity sites induced by the agent; thereby 
generating a regulatory sequence profile associated with the genetic locus.

2. The method of claim 1, wherein the regulatory sequence profile comprises
nucleotide sequences of said DNA hypersensitivity sites and the locations 
thereof within the genetic locus. 

3. The method of claim 1, wherein the genetic locus comprises the coding region 
for at least one expressed gene. 

4. The method of claim 3, wherein the gene is a known gene. 

5. The method of claim 3, wherein the gene is associated with a disease state. 

6. The method of claim 5, wherein the disease state is a cancer. 

7. The method of claim 5, wherein the gene is selected from the group consisting
of: p53, Rb, INK4A/p16, CTNNB1, H-Ras, Fos, MDM2, INK4, ARF1,
PTEN, Jun, WNT3A/14, NFkB, TERT, BRCA1, BRCA2, WAF1/p21, CDK4,
TGF-beta1, RAR, E2F, VHL, MLH1, SMAD4, SMAD2, SMAD3, K-Ras,
EGFR, WT1, Myc, Raf, ABL, and HER2.

8. The method of claim 1, wherein the genetic locus comprises greater than about 
1 kb of DNA. 






WO 2004/053106 



PCT/US2003/040070 



9. The method of claim 1, wherein the genetic locus comprises greater than about 
10 kb of DNA. 

10. The method of claim 1, wherein the genetic locus comprises greater than about
25 kb of DNA.

11. The method of claim 1, wherein the genetic locus comprises greater than about
50 kb of DNA. 

12. The method of claim 1, wherein the genetic locus comprises greater than about 
100 kb of DNA. 

13. The method of claim 1, wherein the genetic locus comprises about 1 to 100 kb 
of DNA. 

14. The method of claim 1, wherein the genetic locus comprises about 25 to 75 kb 
of DNA. 

15. The method of claim 1, wherein the genetic locus comprises about 50 to 100 
kb of DNA. 

16. The method of claim 1, wherein the step of identifying regulatory sequences 
associated with said genetic locus is performed by a plurality of polymerase 
chain reactions. 

17. The method of claim 16, wherein the polymerase chain reactions employ 
primers that amplify products spanning substantially the entirety of the genetic 
locus. 

18. The method of claim 17, wherein said products comprise DNA sequences 
having lengths between about 100 and 1000 base pairs. 

19. The method of claim 17, wherein said products comprise DNA sequences
having lengths between about 100 and 500 base pairs.

20. The method of claim 17, wherein said products comprise DNA sequences
having lengths between about 200 and 300 base pairs.

21. The method of claim 1, wherein said agent that induces modifications in DNA
at hypersensitivity sites is selected from the group consisting of radiation, a
chemical agent, an enzyme, and combinations thereof.

22. The method of claim 21, wherein the radiation comprises UV light radiation.

23. The method of claim 21, wherein the chemical agent is a clastogen.

24. The method of claim 21, wherein the enzyme is selected from the group
consisting of specific endonucleases, non-specific endonucleases,
topoisomerases, methylases, histone acetylases, histone deacetylases, and
combinations thereof.

25. The method of claim 24, wherein the specific endonuclease comprises one or
more four-base restriction endonucleases, one or more six-base restriction
endonucleases, or combinations thereof.

26. The method of claim 25, wherein the four-base restriction endonuclease is
selected from the group consisting of Sau3A, StyI, NlaIII, Hsp92, and
combinations thereof.

27. The method of claim 25, wherein the six-base endonuclease is selected from
the group consisting of EcoRI, HindIII, and combinations thereof.

28. The method of claim 24, wherein the non-specific endonuclease is DNase I.

29. The method of claim 24, wherein the topoisomerase is topoisomerase II.

30. The method of claim 1, wherein the genetic locus comprises DNA isolated 
from the group of organisms consisting of Homo sapiens, rat, mouse, zebrafish,
Drosophila, yeast, C. elegans, and combinations thereof.

31. An isolated DNA hypersensitivity site identified according to the method of
any one of claims 1-30. 

32. A regulatory sequence profile identified according to the method of any one
of claims 1-30. 

33. A nucleotide array comprising a plurality of regulatory sequences
identified by the method of any one of claims 1-30. 

34. The nucleotide array of claim 33, wherein the array is fixed to a slide, a chip, 
or a membrane filter. 

35. The nucleotide array of claim 33, wherein one or more copies of said 
nucleotide sequences of the hypersensitive sites are spotted on said array. 

36. A method of ascertaining the effect of an agent or other environmental 
perturbation on a regulatory sequence profile of a genetic locus, comprising:

(a) obtaining a first regulatory sequence profile associated with the genetic locus, 
wherein the sample from which the regulatory sequences are identified is 
unexposed to the agent or perturbation; 

(b) obtaining a second regulatory sequence profile associated with the genetic 
locus, wherein the sample from which the regulatory sequences are identified 
is exposed to the agent or perturbation; and 

(c) comparing the first profile with the second profile to determine regulatory 
sequences that are affected by the agent or perturbation.
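
The comparison in step (c) of claim 36 can be illustrated with a minimal sketch. It assumes, purely for illustration, that each profile is a list of hypersensitivity sites given as half-open coordinate intervals on the locus and that two sites are "the same" when their intervals overlap. The names `Site`, `overlaps`, and `compare_profiles` are hypothetical and do not appear in the claims.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """A hypersensitivity site as a half-open interval [start, end) on the locus."""
    start: int
    end: int


def overlaps(a: Site, b: Site) -> bool:
    """True when the two intervals share at least one base."""
    return a.start < b.end and b.start < a.end


def compare_profiles(unexposed, exposed):
    """Return sites present only in the unexposed or only in the exposed profile.

    Assumption: overlap of coordinates is used as the matching rule.
    """
    lost = [s for s in unexposed if not any(overlaps(s, t) for t in exposed)]
    gained = [t for t in exposed if not any(overlaps(t, s) for s in unexposed)]
    return {"lost": lost, "gained": gained}


# Illustrative coordinates only, not data from the specification.
unexposed = [Site(1000, 1250), Site(5000, 5250)]
exposed = [Site(1100, 1350), Site(9000, 9250)]
changes = compare_profiles(unexposed, exposed)
```

Under these assumptions, `changes["lost"]` holds the site at 5000-5250 and `changes["gained"]` holds the site at 9000-9250; the site near 1000 is treated as unchanged because the two intervals overlap.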

37. The method of claim 36, wherein the perturbation occurs before obtaining the
sample from a tissue, wherein the environmental perturbation is selected from
the group consisting of an infection of the eukaryotic organism by a
microorganism, loss in immune function of the eukaryotic organism, exposure
of the tissue to high temperature, exposure of the tissue to low temperature,
cancer of the tissue, cancer of another tissue in the eukaryotic organism,
irradiation of the tissue, exposure of the tissue to a chemical or other
pharmaceutical compound, and aging.

38. The method of claim 36, wherein the perturbation occurs after obtaining the 
sample from a tissue, wherein the perturbation is selected from the group 
consisting of exposure of the tissue to high temperature, exposure of the tissue 
to low temperature, irradiation of the tissue, exposure of the tissue to a 
chemical or other pharmaceutical compound, and aging. 

39. The method of claim 36, wherein the perturbation is the addition of one or 
more compounds. 

40. A method for profiling differential regulatory sequence activation associated 
with a genetic locus, comprising: 

(a) obtaining multiple regulatory sequences associated with the genetic locus from 
a first population and labeling them with a first label; 

(b) obtaining multiple regulatory sequences associated with the genetic locus from 
a second population and labeling them with a second label; 

(c) hybridizing the elements from (a) and the fragments from (b) with a DNA
microarray containing DNA species in separate locations that match putative 
or verified regulatory elements associated with the genetic locus; and 

(d) determining the ratio of signals from the first and second labels within the 
array. 
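
Step (d) of claim 40 can be sketched as a per-spot ratio of the two label intensities. The sketch below assumes that each label's signal is available as a mapping from array location to intensity, and it reports the ratio on a log2 scale with a small intensity floor. The function name `label_ratios`, the `floor` parameter, and the log transform are illustrative assumptions, not part of the claim.

```python
import math


def label_ratios(first_signal, second_signal, floor=1.0):
    """Return log2(second / first) for every array location measured with both labels.

    Assumption: intensities below `floor` are raised to `floor` to avoid
    division by zero and unstable ratios at near-empty spots.
    """
    ratios = {}
    for spot in first_signal.keys() & second_signal.keys():
        a = max(first_signal[spot], floor)
        b = max(second_signal[spot], floor)
        ratios[spot] = math.log2(b / a)
    return ratios


# Illustrative intensities only.
control = {"HS1": 200.0, "HS2": 800.0}
treated = {"HS1": 800.0, "HS2": 800.0}
ratios = label_ratios(control, treated)
```

Here `ratios["HS1"]` is 2.0 (a four-fold increase in the second population) and `ratios["HS2"]` is 0.0 (no change).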

41. The method of claim 40, wherein one of the populations is an untreated
control and the other population is treated by contact with at least one agent,
and the signal ratios obtained in step (d) provide an indication of gene
regulatory activity modulated by the agent. 

42. A method of identifying a gene associated with a disease or disorder, 
comprising: 

(a) comparing a regulatory sequence profile of a cell with a disease or disorder
to a regulatory sequence profile of a normal control cell;

(b) identifying a regulatory sequence with different activities in the two cells,
and 

(c) identifying a gene associated with the regulatory sequence identified in step
(b).

43. The method of claim 42, wherein the active chromatin profiles are associated
with a known gene. 

44. The method of claim 42, wherein the active chromatin profiles are associated 
with a specific chromatin region. 

45. The method of claim 42, wherein the disease or disorder is a cancer. 



46. The method of claim 42, wherein the comparison is performed using an array
of regulatory sequences.

47. The method of claim 46, wherein the array includes regulatory
sequences associated with a plurality of genes.

48. A method of identifying a regulatory sequence of a gene, comprising:

(a) preparing a regulatory sequence profile of a gene; and

(b) identifying a regulatory sequence within the profile.

49. The method of claim 48, wherein the regulatory sequence profile is prepared 
according to the method of claim 1. 

50. A method of identifying an allelic form of a gene, comprising:

(a) comparing a regulatory sequence profile of one cell to a regulatory
sequence profile of a second cell, wherein the regulatory sequence profiles are
associated with the same gene; and

(b) identifying a regulatory sequence displaying different activities in the two
cells.

51. The method of claim 50, further comprising obtaining the sequence of at least
one of the identified regulatory sequences. 

52. A method of identifying a cell, comprising: 

(a) determining the regulatory sequence profile associated with a cell; 

(b) comparing the regulatory sequence profile of the cell to a regulatory
sequence profile associated with a known cell type; and

(c) identifying a cell type with the same or a substantially similar regulatory 
sequence profile as the cell, 

thereby identifying the cell type of the cell. 

53. The method of claim 52, wherein the comparison is performed using an array 
of polynucleotides comprising regulatory sequences. 

54. A method of detecting a disease or disorder in a subject, comprising: 

(a) identifying a regulatory sequence profile associated with a disease or
disorder;

(b) determining a regulatory sequence profile of a subject; and

(c) comparing the regulatory sequence profile of the subject to the regulatory 
sequence profile associated with the disease or disorder, wherein the same or a 
similar regulatory sequence profile indicates the presence of the disease or 
disorder, and wherein the regulatory sequence profiles are associated with the 
same genetic locus. 

55. A method of qualifying a patient for a clinical trial, comprising:

(a) identifying a regulatory sequence profile of a patient; and

(b) comparing the regulatory sequence profile of the patient to a regulatory
sequence profile identified in patients suitable for a clinical trial, wherein the
regulatory sequence profiles are associated with the same genetic locus.

56. A method of selecting a therapy for a patient, comprising:

(a) identifying a regulatory sequence profile of a patient;

(b) comparing the regulatory sequence profile identified in step (a) to the
regulatory sequence profile associated with a favorable outcome following a
therapy; and

(c) selecting the therapy if the regulatory sequence profiles are the same or
substantially similar.

57. A method of predicting the outcome of a disease or treatment protocol,
comprising:

(a) identifying a regulatory sequence profile of a patient;

(b) comparing the regulatory sequence profile identified in step (a) to the
regulatory sequence profiles associated with one or more outcomes associated
with a disease or treatment; and

(c) identifying a regulatory sequence profile associated with an outcome
associated with a disease or treatment that is the same or substantially similar
to the regulatory sequence profile identified in step (a).

58. A method of screening a drug candidate, comprising:

(a) identifying one or more regulatory sequence profiles associated with a cell
with a disease or disorder, wherein the cell is not treated with a candidate
drug;

(b) providing the candidate drug to a cell with the disease or disorder;

(c) identifying one or more regulatory sequence profiles associated with the cell
provided with the candidate drug; and

(d) comparing the regulatory sequence profiles of steps (a) and (c) and thereby
determining whether treatment with the candidate drug altered a regulatory
sequence profile.

59. A method of identifying a drug useful in treating a disease or disorder, 
comprising: 

(a) identifying an regulatory sequence profile associated with a disease or 
disorder; 

(b) treating a cell with the disease or disorder with a candidate drug; 

(c) identifying an regulatory sequence profile after treatment with the candidate 
drug, wherein the regulatory sequence profiles correspond to the same genetic 
locus; and 

(d) comparing the regulatory sequence profiles of steps (a) and (c) to determine if 
treatment with the candidate drug affected the regulatory sequence profile. 

60. A drug identified according to the method of claim 59. 

61. A method of manufacturing a drug, comprising: 

(a) identifying a drug that alters a regulatory sequence profile associated with a
disease or disorder; and 

(b) manufacturing the identified drug. 

62. A computer readable medium comprising a regulatory sequence profile
associated with a genetic locus. 

63. The computer readable medium of claim 62, wherein the genetic locus 
comprises an open reading frame. 

64. The computer readable medium of claim 63, wherein the open reading frame 
encodes a gene associated with a disease or disorder. 

65. The computer readable medium of claim 64, wherein the disease or disorder is
a cancer.

66. The computer readable medium of claim 64, wherein the gene is selected from
the group consisting of: p53, Rb, INK4A/p16, CTNNB1, H-Ras, Fos, MDM2,
INK4, ARF1, PTEN, Jun, WNT3A/14, NFkB, TERT, BRCA1, BRCA2,
WAF1/p21, CDK4, TGF-beta1, RAR, E2F, VHL, MLH1, SMAD4, SMAD2,
SMAD3, K-Ras, EGFR, WT1, Myc, Raf, ABL, and HER2.

67. The computer readable medium of claim 66, wherein the active chromatin
profile contains the genomic position and activity of one or more regulatory
sequences.

68. The computer readable medium of claim 67, wherein the genetic locus
comprises an open reading frame.

69. A computer readable medium comprising a plurality of regulatory sequence
profiles associated with a specific cell.

70. The computer readable medium of claim 69, wherein the cell is a mammalian
cell.

71. The computer readable medium of claim 69, wherein the cell is a diseased
cell.

72. The computer readable medium of claim 69, wherein the regulatory sequence
profiles include the genetic location and activities of at least one regulatory
sequence.

73. A computer readable medium comprising a plurality of regulatory sequence
profiles associated with different cells.

74. The computer readable medium of claim 73, wherein the regulatory sequence
profiles are associated with the same genetic locus. 

75. The computer readable medium of claim 73, wherein the regulatory sequence 
profiles include regulatory sequence profiles associated with a plurality of 
genetic loci for each cell. 

76. The computer readable medium of claim 73, wherein one or more cells is 
treated with an agent. 

77. The computer readable medium of claim 76, wherein the agent is a drug 
candidate. 

78. The computer readable medium of claim 73, wherein the cells are derived 
from different tissues. 

79. The computer readable medium of claim 73, wherein one or more cells is a 
diseased cell. 

80. A computer readable medium comprising regulatory sequence profiles for at 
least two genetic loci, wherein each locus comprises an open reading frame 
and one or more regulatory sequences associated with that gene, and wherein 
the profile includes polynucleotide sequences selected from the group 
consisting of: 

(a) sequences of open reading frames;

(b) sequences that hybridize to an open reading frame under moderately
stringent conditions; 

(c) degenerate sequences of open reading frames; and 

(d) sequences that hybridize to degenerate sequences of open reading frames. 

81. The computer readable medium of claim 80 comprising the sequences for at
least one gene selected from the group consisting of: p53, Rb, INK4A/p16,
CTNNB1, H-Ras, Fos, MDM2, INK4, ARF1, PTEN, Jun, WNT3A/14, NFkB,
TERT, BRCA1, BRCA2, WAF1/p21, CDK4, TGF-beta1, RAR, E2F, VHL,
MLH1, SMAD4, SMAD2, SMAD3, K-Ras, EGFR, WT1, Myc, Raf, ABL,
and HER2.

82. The computer readable medium of claim 80, wherein at least one regulatory 
sequence is a promoter or enhancer of transcription for a gene. 

83. A computer executable program for comparing regulatory sequence profiles of 
two or more cells, comprising: 

(a) inputting a regulatory sequence profile associated with a genetic locus in a
first cell;

(b) inputting a regulatory sequence profile associated with the same genetic
locus in a second cell; and 

(c) outputting a comparison of the regulatory sequence profiles of steps (a) and 
(b). 

84. A computer executable program for the identification of a cell, comprising: 

(a) inputting a regulatory sequence profile associated with one or more genetic
loci in a cell; 

(b) searching a data set comprising regulatory sequence profiles for the same 
genetic loci in one or more known cell types; and 

(c) outputting a cell type with the same or a substantially similar regulatory 
sequence profile as the regulatory sequence profile of step (a). 
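
The search in steps (b) and (c) of claim 84 can be sketched as a nearest-profile lookup. The sketch assumes each profile is reduced to a set of site identifiers and that "substantially similar" is scored with the Jaccard index. Both choices, and the names `jaccard` and `identify_cell_type`, are assumptions made for illustration; the claims do not specify a similarity measure.

```python
def jaccard(a, b):
    """Jaccard index of two collections of site identifiers (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def identify_cell_type(query, reference):
    """Return (cell_type, score) for the reference profile most similar to `query`.

    `reference` maps a known cell type name to its set of site identifiers.
    """
    best = max(reference, key=lambda name: jaccard(query, reference[name]))
    return best, jaccard(query, reference[best])


# Hypothetical reference data set and query profile.
reference = {
    "erythroid": {"HS1", "HS2", "HS3"},
    "lymphoid": {"HS4", "HS5"},
}
cell_type, score = identify_cell_type({"HS1", "HS2"}, reference)
```

With this toy data the query matches the "erythroid" profile with a score of 2/3. In practice a minimum score would probably be required before reporting a match; that threshold is left out here because the claims give no value for it.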

85. A method of regulating gene expression, comprising: 

(a) identifying a regulatory sequence profile associated with a desired pattern of
gene expression; 

(b) preparing a nucleic acid vector comprising at least a plurality of regulatory 
sequences within the profile of step (a) operably linked to a gene sequence; 
and 

(c) introducing the vector into a cell. 

86. The method of claim 85, wherein the vector is stably introduced into the cell to
obtain permanent heritable transmission of the regulatory sequences and 
operably linked gene sequence. 

87. The method of claim 85, wherein the gene encodes a regulatory protein. 

88. The method of claim 85, wherein the gene encodes a therapeutic molecule. 

89. The method of claim 88, wherein the therapeutic molecule is a polypeptide or 
a polynucleotide. 

90. The method of claim 89, wherein the therapeutic molecule is selected from the 
group consisting of: ribozymes, antisense RNA, double-stranded RNA, small 
interfering RNA, and short hairpin RNA. 

91. A regulatory sequence identified by the method of claim 48. 

92. An allelic variant identified by the method of claim 50. 

93. A computer executable program for profiling a genetic locus for active 
chromatin, comprising inputting data comprising regions of chromatin 
hypersensitivity sites derived from a selected cell or tissue type; comparing 
said data with data derived from a different cell or tissue type or with a
control data set; and outputting at least one sequence associated with said 
locus or a genomic location of said active chromatin. 

94. The method of claim 93, wherein said inputted data comprises sequences of 
chromatin hypersensitive sites generated by enzymatic digestion of chromatin. 

95. The method of claim 93, wherein said inputted data comprises sequences of
chromatin hypersensitive sites generated by using thermostable polymerase 
amplification of preselected regions of the genome. 

96. The method of claim 93, wherein said preselected regions are within 500 kb of 
a gene known to be associated with a disease state. 



97. A computer executable program for profiling a genetic locus for allelic
variants affecting the formation of active chromatin, comprising inputting data
comprising regions of chromatin hypersensitivity sites derived from a selected 
mammalian cell or tissue type; comparing said data with data derived from the 
same cell or tissue type isolated from another mammal of the same species 
with a control data set representing normal or expected sequences from said 
species; and outputting at least one sequence having an allelic variant affecting 
said active chromatin formation. 

98. A regulatory profile platform comprising regulatory sequences associated with 
a plurality of genetic loci in a plurality of different cell types. 

99. A method for profiling chromatin sensitivity of a genomic region of cells of a 
cell type to digestion by a DNA modifying agent, comprising determining a 
chromatin sensitivity profile, said chromatin sensitivity profile comprising a 
plurality of replicate measurements of each of a plurality of different genomic 
sequences in said genomic region, wherein each of said plurality of replicate
measurements is a ratio of (i) copy numbers of an amplicon comprising said 
genomic sequence measured by real-time quantitative PCR (qPCR) with 
chromatin of said cell type that has been treated with said DNA modifying agent 
and (ii) copy numbers of said amplicon measured by real-time qPCR with 
chromatin of said cell type that has not been treated with said DNA modifying 
agent. 
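
The ratio defined in claim 99 can be sketched as follows. The claim does not state how copy numbers are derived from the qPCR readout, so this sketch assumes they are relative copy numbers computed from threshold cycles as `efficiency ** (-Ct)`, with `efficiency = 2.0` standing for perfect doubling per cycle. The function names and the pairing of treated and untreated replicates by position are likewise illustrative assumptions.

```python
def copy_number(ct, efficiency=2.0):
    """Relative copy number implied by a threshold cycle (assumed model: E ** -Ct)."""
    return efficiency ** (-ct)


def sensitivity_ratio(ct_treated, ct_untreated, efficiency=2.0):
    """Ratio of treated to untreated copy numbers for one amplicon and one replicate."""
    return copy_number(ct_treated, efficiency) / copy_number(ct_untreated, efficiency)


def sensitivity_profile(ct_treated, ct_untreated, efficiency=2.0):
    """Map each amplicon to its list of replicate ratios.

    Both inputs map amplicon name -> list of Ct values; replicates are paired by index.
    """
    return {
        amp: [
            sensitivity_ratio(t, u, efficiency)
            for t, u in zip(ct_treated[amp], ct_untreated[amp])
        ]
        for amp in ct_treated
    }


# Illustrative Ct values only: digestion delays amplification by 1 or 2 cycles.
treated = {"amp1": [26.0, 26.0, 27.0]}
untreated = {"amp1": [25.0, 25.0, 25.0]}
profile = sensitivity_profile(treated, untreated)
```

For these values `profile["amp1"]` is `[0.5, 0.5, 0.25]`: a lower ratio means less intact template after treatment, that is, greater sensitivity. Passing a measured per-amplicon efficiency instead of 2.0 is one way the efficiency correction of a later claim could be approximated.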

100. The method of claim 99, wherein said plurality of different genomic sequences
comprises successively overlapping sequences tiled across one or more
portions of said genomic region.

101. The method of claim 100, wherein said plurality of different genomic sequences
comprises successively overlapping sequences tiled across said genomic
region.
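
A tiling of successively overlapping sequences, as recited in claims 100 and 101, can be sketched as below. The amplicon length of 250 and the step of 125 are assumed defaults chosen for illustration, as are the function name `tile_region` and the use of half-open coordinates. Real primer design would also have to account for sequence constraints that this sketch ignores.

```python
def tile_region(start, end, length=250, step=125):
    """Return half-open (start, end) tiles of fixed length covering [start, end).

    Consecutive tiles overlap because step < length. A final tile is anchored
    at the region end when the regular grid would leave the end uncovered.
    """
    if not 0 < step < length:
        raise ValueError("step must be positive and smaller than length")
    if end - start < length:
        raise ValueError("region is shorter than one amplicon")
    tiles = []
    pos = start
    while pos + length <= end:
        tiles.append((pos, pos + length))
        pos += step
    if tiles[-1][1] < end:
        tiles.append((end - length, end))
    return tiles


tiles = tile_region(0, 600)
```

For a 600-base region this yields `[(0, 250), (125, 375), (250, 500), (350, 600)]`, so every base is covered and each tile overlaps its predecessor.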

102. The method of claim 99, wherein each of said plurality of different genomic
sequences has a length in the range of about 75 to about 300 bases.

103. The method of claim 102, wherein the mean length of said plurality of different
genomic sequences is about 250 bases.

104. The method of claim 99, wherein said plurality of duplicate measurements
consists of at least 3 duplicate measurements.

105. The method of claim 104, wherein said plurality of duplicate measurements
consists of at least 6 duplicate measurements.

106. The method of claim 105, wherein said plurality of duplicate measurements
consists of at least 9 duplicate measurements.

107. The method of claim 99, further comprising determining a baseline chromatin
sensitivity profile by a method comprising

(a) smoothing the data in said chromatin sensitivity profile to obtain a baseline curve;
and

(b) determining the error bounds for said baseline curve,

wherein said baseline curve and said error bounds constitute said baseline chromatin
profile.

108. The method of claim 107, wherein said smoothing is carried out using LOWESS.
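
The smoothing step can be illustrated with a simplified, dependency-free stand-in for LOWESS: a locally weighted linear fit with tricube weights at each position. This is a sketch only. It omits the robustness iterations of the full LOWESS procedure, and the `frac` parameter (fraction of points in each local window) is an assumed value, since the claims do not give one.

```python
def lowess(x, y, frac=0.5):
    """Locally weighted linear regression evaluated at every x (no robustness passes)."""
    n = len(x)
    k = max(2, int(round(frac * n)))  # number of neighbours in each window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        # Tricube weights; points at or beyond the bandwidth get weight 0.
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Illustrative profile: flat baseline of 1.0 with one outlying point.
positions = list(range(10))
ratios = [1.0] * 10
ratios[5] = 10.0
baseline = lowess(positions, ratios)
```

The fitted value at position 5 is pulled well below 10.0 toward the neighbouring points, which is the behaviour wanted from a baseline curve. A production analysis would more likely call an established implementation than this hand-rolled version.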

109. The method of claim 107, wherein said error bounds are determined by a method
comprising

(b1) mean centering said plurality of replicates for each genomic sequence in said
chromatin sensitivity profile about said baseline curve to generate a mean- 
centered chromatin sensitivity profile, wherein said mean-centering is carried 
out by setting the mean of each said plurality of replicates to the value of the 
corresponding genomic sequence on said baseline curve; 

(b2) determining the median M of said mean-centered chromatin sensitivity profile; 

(b3) determining the Median Absolute Deviation MAD of said mean-centered 
chromatin sensitivity profile; 

(b4) discarding for each genomic sequence any replicate measurement X that satisfies the
equation

$$\frac{|X - M|}{\mathrm{MAD} / 0.6745} > 2.24,$$

and

(b5) defining the error bounds as the lower and upper confidence limits on the 
remaining data. 
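
Steps (b2) through (b4) of claim 109 amount to a median and MAD outlier rule, which can be sketched as below. The sketch assumes the input is the pooled list of mean-centered replicate values. The function name `mad_filter` and the handling of a zero MAD (nothing is discarded) are illustrative assumptions.

```python
from statistics import median


def mad_filter(values, cutoff=2.24):
    """Split values into (kept, discarded) using |X - M| / (MAD / 0.6745) > cutoff."""
    values = list(values)
    m = median(values)
    mad = median([abs(v - m) for v in values])
    if mad == 0:
        # Assumption: with no spread the rule is undefined, so keep everything.
        return values, []
    scale = mad / 0.6745
    kept = [v for v in values if abs(v - m) / scale <= cutoff]
    discarded = [v for v in values if abs(v - m) / scale > cutoff]
    return kept, discarded


# Illustrative mean-centered values with one low outlier.
centered = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 0.2]
kept, discarded = mad_filter(centered)
```

Here the median is 1.0 and the MAD is about 0.05, so only the value 0.2 exceeds the cutoff and is discarded. Dividing the MAD by 0.6745 makes it comparable to a standard deviation for normally distributed data.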

110. The method of claim 107, wherein said error bounds are determined by a method
comprising

(b1) generating a bootstrap chromatin sensitivity profile by randomly selecting one
replicate measurement from said plurality of replicate measurements for each 
genomic sequence; 

(b2) mean centering said plurality of replicates for each genomic sequence in said 
bootstrap chromatin sensitivity profile about said baseline curve to generate a 
mean-centered chromatin sensitivity profile, wherein said mean-centering is 
carried out by setting the mean of each said plurality of replicates to the value 
of the corresponding genomic sequence on said baseline curve; 

(b3) determining the median M of said mean-centered chromatin sensitivity profile; 

(b4) determining the Median Absolute Deviation MAD of said mean-centered 
chromatin sensitivity profile; 

(b5) discarding for each genomic sequence any replicate measurement X that satisfies the
equation

$$\frac{|X - M|}{\mathrm{MAD} / 0.6745} > 2.24;$$

(b6) determining the maximum lower and minimum upper outliers on the remaining
data;

(b7) repeating said steps (b1)-(b6) a plurality of times; and

(b8) calculating the upper and lower outlier cutoff values and BCa confidence
intervals.
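
The resampling idea in claim 110 can be sketched in simplified form. In each iteration the sketch draws one replicate per genomic sequence, computes the median M and the scaled MAD of the draw, and records the cutoffs M ± 2.24 × MAD / 0.6745. It then summarizes the cutoffs with plain percentile intervals, which are swapped in for the BCa intervals named in the claim because BCa needs additional bias and acceleration terms. The function name, the iteration count, and the seed are assumptions for illustration.

```python
import random
from statistics import median


def bootstrap_cutoffs(replicates, n_iter=1000, cutoff=2.24, alpha=0.05, seed=0):
    """Bootstrap lower/upper outlier cutoffs from mean-centered replicate values.

    `replicates` is a list with one inner list of replicate values per genomic sequence.
    """
    rng = random.Random(seed)
    lowers, uppers = [], []
    for _ in range(n_iter):
        sample = [rng.choice(reps) for reps in replicates]  # one replicate per sequence
        m = median(sample)
        scale = median([abs(v - m) for v in sample]) / 0.6745
        lowers.append(m - cutoff * scale)
        uppers.append(m + cutoff * scale)

    def summarize(vals):
        s = sorted(vals)
        lo = s[int((alpha / 2) * (len(s) - 1))]
        hi = s[int((1 - alpha / 2) * (len(s) - 1))]
        # Percentile interval, not BCa (see lead-in).
        return {"estimate": median(s), "interval": (lo, hi)}

    return {"lower": summarize(lowers), "upper": summarize(uppers)}


# Illustrative mean-centered replicates for five genomic sequences.
residuals = [
    [-0.10, 0.00, 0.10],
    [0.05, -0.05, 0.00],
    [0.02, -0.02, 0.04],
    [-0.08, 0.08, 0.00],
    [0.01, -0.03, 0.02],
]
bounds = bootstrap_cutoffs(residuals, n_iter=200, seed=1)
```

Fixing the seed makes the result reproducible, which helps when the cutoffs are compared across runs.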

111. The method of claim 109, further comprising

(c1) identifying one or more genomic sequences among said plurality of genomic
sequences whose 20% trimmed means lie outside said error bounds; and

(c2) determining a signal-to-noise ratio S/N of said identified genomic sequences
according to the equation

$$S/N_i = \frac{|HS_i - B_i|}{\mathrm{MAD}_B \, (\sigma_c / \sigma_{HS})}$$

where $S/N_i$ is the signal-to-noise ratio at site i, $HS_i$ is the Y% trimmed mean of the
corresponding HS cluster, $B_i$ is the value of said baseline curve at said site i,
$\mathrm{MAD}_B$ is the median absolute deviation of the centered baseline, $\sigma_{HS}$ is the
average variance of replicate measurements, and $\sigma_c$ is the variance of the
replicate measurements at said site i.
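
Step (c1) can be sketched as a trimmed-mean test against the error bounds. The sketch assumes that the trimming proportion is removed from each tail of the sorted replicates and that the error bounds are expressed as offsets from the baseline curve. Both conventions, and the names `trimmed_mean` and `flag_sites`, are assumptions made for illustration.

```python
def trimmed_mean(values, proportion=0.20):
    """Mean after dropping `proportion` of the sorted values from each tail."""
    s = sorted(values)
    k = int(len(s) * proportion)
    core = s[k:len(s) - k] if len(s) > 2 * k else s
    return sum(core) / len(core)


def flag_sites(profile, baseline, lower, upper, proportion=0.20):
    """Return sites whose trimmed mean, relative to baseline, lies outside [lower, upper].

    `profile` maps site -> replicate values; `baseline` maps site -> baseline value.
    """
    flagged = []
    for site, replicates in profile.items():
        offset = trimmed_mean(replicates, proportion) - baseline[site]
        if offset < lower or offset > upper:
            flagged.append(site)
    return flagged


# Illustrative replicate ratios; site "b" is strongly depleted after treatment.
profile = {
    "a": [1.0, 1.1, 0.9, 1.0, 1.0],
    "b": [0.30, 0.35, 0.25, 0.30, 0.90],
}
baseline = {"a": 1.0, "b": 1.0}
flagged = flag_sites(profile, baseline, lower=-0.2, upper=0.2)
```

With five replicates and 20% trimming, one value is dropped from each tail, so the stray 0.90 at site "b" does not mask the depletion and only "b" is flagged.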

112. The method of claim 110, further comprising

(c1) identifying one or more genomic sequences among said plurality of genomic
sequences whose 20% trimmed means lie outside said error bounds; and

(c2) determining a signal-to-noise ratio S/N of said identified genomic sequences
according to the equation

$$S/N_i = \frac{|HS_i - B_i|}{\mathrm{MAD}_B \, (\sigma_c / \sigma_{HS})}$$

where $S/N_i$ is the signal-to-noise ratio at site i, $HS_i$ is the Y% trimmed mean of the
corresponding HS cluster, $B_i$ is the value of said baseline curve at said site i,
$\mathrm{MAD}_B$ is the median absolute deviation of the centered baseline, $\sigma_{HS}$ is the
average variance of replicate measurements, and $\sigma_c$ is the variance of the
replicate measurements at said site i.

113. The method of any one of claims 99-112, wherein each said copy number has
been corrected for amplification efficiency.

114. The method of any one of claims 99-112, wherein said DNA modifying agent is
DNase I.

115. The method of any one of claims 99-112, wherein each of said plurality of
duplicated measurements is measured by independent real-time qPCR
experiments.

116. The method of any one of claims 99-112, wherein each of said plurality of
duplicated measurements is measured by independent real-time qPCR
experiments using different treated chromatin samples.

117. A method for profiling chromatin sensitivity of a genomic region of cells of a
cell type to digestion by a DNA modifying agent, comprising

(a) treating chromatin of cells of said cell type with said DNA modifying agent such
that digestion of DNA occurs and retrieving DNA molecules;

(b) amplifying a plurality of different genomic sequences in said genomic region by
real-time quantitative PCR using at least a portion of said retrieved DNA
molecules and determining copy numbers of amplification product of each
said genomic sequence;

(c) amplifying said plurality of different genomic sequences in said genomic region
by real-time quantitative PCR using DNA molecules obtained from chromatin
of cells of said cell type that is not treated by said DNA modifying agent and
determining copy numbers of amplification product of each said genomic
sequence;

(d) determining a ratio of said copy numbers measured in step (b) and copy numbers
measured in said step (c);

(e) repeating said steps (b)-(d) a plurality of times to generate a plurality of ratios,
thereby generating a plurality of replicate measurements for each of said
genomic sequences; and

(f) determining a chromatin sensitivity profile of said genomic region, said chromatin
sensitivity profile comprising said plurality of replicate measurements.

118. The method of claim 117, wherein said plurality of different genomic sequences
comprises successively overlapping sequences tiled across one or more
portions of said genomic region.

119. The method of claim 118, wherein said plurality of different genomic sequences
comprises successively overlapping sequences tiled across said genomic
region.

120. The method of claim 117, wherein each of said plurality of different genomic
sequences has a length in the range of about 75 to about 300 bases.

121. The method of claim 120, wherein the mean length of said plurality of different
genomic sequences is about 250 bases.

122. The method of claim 117, wherein said plurality of duplicate measurements
consists of at least 3 duplicate measurements.

123. The method of claim 122, wherein said plurality of duplicate measurements
consists of at least 6 duplicate measurements.

124. The method of claim 123, wherein said plurality of duplicate measurements
consists of at least 9 duplicate measurements.

125. The method of claim 117, further comprising determining a baseline chromatin
sensitivity profile by a method comprising

(a) smoothing the data in said chromatin sensitivity profile to obtain a baseline curve;
and

(b) determining the error bounds for said baseline curve,

wherein said baseline curve and said error bounds constitute said baseline chromatin
profile.

126. The method of claim 125, wherein said smoothing is carried out using LOWESS. 

127. The method of claim 125, wherein said error bounds are determined by a method
comprising

(b1) mean centering said plurality of replicates for each genomic sequence in said
chromatin sensitivity profile about said baseline curve to generate a mean- 
centered chromatin sensitivity profile, wherein said mean-centering is carried 
out by setting the mean of each said plurality of replicates to the value of the 
corresponding genomic sequence on said baseline curve; 

(b2) determining the median M of said mean-centered chromatin sensitivity profile; 

(b3) determining the Median Absolute Deviation MAD of said mean-centered 
chromatin sensitivity profile; 

(b4) discarding for each genomic sequence any replicate measurement X that satisfies the
equation

$$\frac{|X - M|}{\mathrm{MAD} / 0.6745} > 2.24,$$

and

(b5) defining the error bounds as the lower and upper confidence limits on the 
remaining data. 

128. The method of claim 125, wherein said error bounds are determined by a method
comprising

(b1) generating a bootstrap chromatin sensitivity profile by randomly selecting one
replicate measurement from said plurality of replicate measurements for each 
genomic sequence; 

(b2) mean centering said plurality of replicates for each genomic sequence in said 
bootstrap chromatin sensitivity profile about said baseline curve to generate a 
mean-centered chromatin sensitivity profile, wherein said mean-centering is 
carried out by setting the mean of each said plurality of replicates to the value 
of the corresponding genomic sequence on said baseline curve; 

(b3) determining the median M of said mean-centered chromatin sensitivity profile;

(b4) determining the Median Absolute Deviation MAD of said mean-centered
chromatin sensitivity profile;

(b5) discarding for each genomic sequence any replicate measurement X that satisfies the
equation

$$\frac{|X - M|}{\mathrm{MAD} / 0.6745} > 2.24;$$

(b6) determining the maximum lower and minimum upper outliers on the remaining
data;

(b7) repeating said steps (b1)-(b6) a plurality of times; and

(b8) calculating the upper and lower outlier cutoff values and BCa confidence
intervals.

129. The method of claim 127, further comprising

(c1) identifying one or more genomic sequences among said plurality of genomic
sequences whose 20% trimmed means lie outside said error bounds; and

(c2) determining a signal-to-noise ratio S/N of said identified genomic sequences
according to the equation

$$S/N_i = \frac{|HS_i - B_i|}{\mathrm{MAD}_B \, (\sigma_c / \sigma_{HS})}$$

where $S/N_i$ is the signal-to-noise ratio at site i, $HS_i$ is the Y% trimmed mean of the
corresponding HS cluster, $B_i$ is the value of said baseline curve at said site i,
$\mathrm{MAD}_B$ is the median absolute deviation of the centered baseline, $\sigma_{HS}$ is the
average variance of replicate measurements, and $\sigma_c$ is the variance of the
replicate measurements at said site i.

130. The method of claim 127, further comprising

(c1) identifying one or more genomic sequences among said plurality of genomic
sequences whose 20% trimmed means lie outside said error bounds; and

(c2) determining a signal-to-noise ratio S/N of said identified genomic sequences
according to the equation

$$S/N_i = \frac{|HS_i - B_i|}{\mathrm{MAD}_B \, (\sigma_c / \sigma_{HS})}$$

where $S/N_i$ is the signal-to-noise ratio at site i, $HS_i$ is the Y% trimmed mean of the
corresponding HS cluster, $B_i$ is the value of said baseline curve at said site i,
$\mathrm{MAD}_B$ is the median absolute deviation of the centered baseline, $\sigma_{HS}$ is the
average variance of replicate measurements, and $\sigma_c$ is the variance of the
replicate measurements at said site i.

131. The method of any one of claims 117-130, wherein each said copy number has
been corrected for amplification efficiency.

132. The method of any one of claims 117-131, wherein said DNA modifying agent is
DNase I.

133. The method of any one of claims 117-130, wherein each of said plurality of
duplicated measurements is measured by independent real-time qPCR
experiments.

134. The method of any one of claims 117-130, wherein each of said plurality of
duplicated measurements is measured by independent real-time qPCR
experiments using different treated chromatin samples.

135. The method of any one of claims 111-112 and 129-130, wherein said Y%
trimmed mean is a 20% trimmed mean.